
    Arguing Machines: Human Supervision of Black Box AI Systems That Make Life-Critical Decisions

    We consider the paradigm of a black box AI system that makes life-critical decisions. We propose an "arguing machines" framework that pairs the primary AI system with a secondary one that is independently trained to perform the same task. We show that disagreement between the two systems, without any knowledge of underlying system design or operation, is sufficient to arbitrarily improve the accuracy of the overall decision pipeline given human supervision over disagreements. We demonstrate this system in two applications: (1) an illustrative example of image classification and (2) large-scale real-world semi-autonomous driving data. For the first application, we apply this framework to image classification, achieving a reduction from 8.0% to 2.8% top-5 error on ImageNet. For the second application, we apply this framework to Tesla Autopilot and demonstrate the ability to predict 90.4% of system disengagements that were labeled by human annotators as challenging and needing human supervision.
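    The disagreement-based escalation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `arguing_machines`, the callables `primary`, `secondary`, and `human`, and the toy parity task are all hypothetical and chosen only for illustration. The sketch assumes both systems return comparable labels and that the human supervisor is consulted only when the two disagree.

```python
from typing import Callable, Hashable, Iterable, List, Tuple, TypeVar

X = TypeVar("X")


def arguing_machines(
    inputs: Iterable[X],
    primary: Callable[[X], Hashable],
    secondary: Callable[[X], Hashable],
    human: Callable[[X], Hashable],
) -> Tuple[List[Hashable], int]:
    """Return (decisions, number of inputs escalated to the human).

    Both systems are treated as black boxes: only their outputs are compared.
    """
    decisions: List[Hashable] = []
    escalated = 0
    for x in inputs:
        a = primary(x)
        b = secondary(x)
        if a == b:
            # Agreement: accept the shared answer without supervision.
            decisions.append(a)
        else:
            # Disagreement: hand the case to the human supervisor.
            escalated += 1
            decisions.append(human(x))
    return decisions, escalated


# Toy usage (hypothetical task: label an integer by its parity).
def toy_primary(x: int) -> int:
    return x % 2


def toy_secondary(x: int) -> int:
    # Deliberately wrong on x == 3 to force one disagreement.
    return 0 if x == 3 else x % 2


def toy_human(x: int) -> int:
    return x % 2


toy_decisions, toy_escalated = arguing_machines(
    range(5), toy_primary, toy_secondary, toy_human
)
```

    In this toy run only the single disagreeing input reaches the human. Cases where both systems agree on a wrong answer are not caught by this scheme.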

    An Initial Assessment of the Significance of Task Pacing on Self-Report and Physiological Measures of Workload While Driving

    In block A of a simulator study, a sample of 38 drivers showed a stepwise increase in heart rate and skin conductance level (SCL) from single task driving and across 3 levels of an auditory presentation – verbal response dual task (n-back), replicating findings from on-road research. Subjective ratings showed a similar stepwise increase, establishing concurrent validity of the physiological indices as measures of workload. In block B, varying the inter-stimulus interval in the intermediate 1-back level of the task resulted in a pattern across self-report workload ratings, heart rate, and SCL suggesting that task pacing may influence effective workload. Further consideration of the impact of task pacing in auditory-verbal in-vehicle applications is indicated.

    A Comparison of Heart Rate and Heart Rate Variability Indices in Distinguishing Single-Task Driving and Driving Under Secondary Cognitive Workload

    Heart rate and heart rate variability (HRV) measures collected under actual highway driving from 25 young adults were compared to assess the relative sensitivity of each for distinguishing between a period of single task driving and periods of low and high additional cognitive workload. Basic heart rate, skin conductance, and most, but not all, of the HRV indices were significantly different between single task driving and the high secondary demand period. Heart rate and skin conductance were also robust at distinguishing between single task driving and the low added demand period; however, several HRV measures did not show statistically significant differences between these two periods, and the remaining HRV measures that did were less robust than basic heart rate as assessed by effect size and observed power. Rather than attempting to argue for the inherent superiority of any one physiological measure, these findings are presented with the intent of encouraging a broader discussion around the conditions under which particular physiological measures may be most useful and/or complementary for detecting different aspects of workload and operator state.

    Toward an Antiphony Framework for Dividing Tasks into Subtasks

    Task analysis is a staple of ergonomics, neuroergonomics, human factors, and experimental psychology inquiry, and often benefits from granularity beyond the task level to the subtask level. The concept and challenge of identifying the subcomponents of tasks are neither new nor solved. Practitioners routinely identify subdivisions that are individually internally consistent and yet conflict with one another. The challenge of producing reliable, valid subtask data across efforts recommends a unified framework for identifying consistent subtask divisions within tasks. A framework is put forward here, based upon universal “antiphony” turn-taking behavior in human-human interaction, but adapted to address the highly scripted vocabulary of human-machine interaction. Practical application to a real-world vehicle interface is demonstrated, and an example is discussed in the light of research design, applied use, and future improvement.

    A Field Study Assessing Driving Performance, Visual Attention, Heart Rate and Subjective Ratings in Response to Two Types of Cognitive Workload

    In an on-road experiment, driving performance, visual attention, heart rate, and subjective ratings of workload were evaluated in response to a working memory (n-back) task and a visual-spatial (clock) task. Subjective workload ratings for the two types of tasks did not statistically differ, suggesting a similar level of overall workload. Gaze concentration and heart rate showed significant changes relative to single task driving during the extra tasks, and the magnitude of change was similar for both, while driving performance measures were not sensitive to the increase in workload. The results suggest high sensitivity of both gaze dispersion and heart rate as measures of workload across these two different types of cognitive demand.